Using spatial hints to improve policy reuse in a reinforcement learning agent
Authors
Abstract
We study the problem of knowledge reuse by a reinforcement learning agent. We are interested in how an agent can exploit previously learned policies to learn a new task more efficiently. Our approach is to elicit spatial hints from an expert, suggesting the world states in which each existing policy should be more relevant to the new task. By combining these hints with domain exploration, the agent detects the portions of existing policies that benefit the new task and therefore learns a new policy more efficiently. We call our approach Spatial Hints Policy Reuse (SHPR). Experiments demonstrate the effectiveness and robustness of our method. Our results encourage further study of how much more efficiency can be gained by eliciting very simple advice from humans.
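The abstract does not specify how hints enter action selection, so the following is only an illustrative sketch of one way a spatial hint could bias policy reuse, not the SHPR algorithm itself. It assumes a 2-D grid state, one hint location per past policy, and a Gaussian decay of relevance with distance; the names `hint_relevance`, `choose_action`, `reuse_prob`, `epsilon`, and `radius` are all hypothetical.

```python
import math
import random


def hint_relevance(state, hint, radius):
    """Assumed relevance model: decays smoothly with distance from the hint."""
    dist_sq = (state[0] - hint[0]) ** 2 + (state[1] - hint[1]) ** 2
    return math.exp(-dist_sq / (2.0 * radius ** 2))


def choose_action(state, q_values, actions, hinted_policies, rng,
                  reuse_prob=0.9, epsilon=0.1, radius=2.0):
    """Pick an action by reusing a hinted past policy or the agent's own estimate.

    hinted_policies is a list of (policy, hint) pairs, where policy maps a
    state to an action and hint is the (x, y) location the expert suggested.
    """
    # Find the past policy whose hint is most relevant to the current state.
    best_policy, best_rel = None, 0.0
    for policy, hint in hinted_policies:
        rel = hint_relevance(state, hint, radius)
        if rel > best_rel:
            best_policy, best_rel = policy, rel

    # Reuse that policy with a probability scaled by its relevance here.
    if best_policy is not None and rng.random() < reuse_prob * best_rel:
        return best_policy(state)

    # Otherwise fall back to epsilon-greedy on the new task's own Q-values.
    if rng.random() < epsilon:
        return rng.choice(actions)
    return max(actions, key=lambda a: q_values.get((state, a), 0.0))


if __name__ == "__main__":
    rng = random.Random(0)
    go_right = lambda s: "right"  # stand-in for a previously learned policy
    action = choose_action((3, 3), {}, ["up", "right"], [(go_right, (3, 3))], rng)
    print(action)
```

Under these assumptions, a past policy dominates near its hint and fades out elsewhere, so the agent's own exploration takes over in regions the expert did not mark.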
Similar resources
AgentX: Using Reinforcement Learning to Improve the Effectiveness of Intelligent Tutoring Systems
Reinforcement Learning (RL) can be used to train an agent to comply with the needs of a student using an intelligent tutoring system. In this paper, we introduce a method of increasing efficiency by way of customization of the hints provided by a tutoring system, by applying techniques from RL to gain knowledge about the usefulness of hints leading to the exclusion or introduction of other help...
Probabilistic Policy Reuse
We contribute Policy Reuse as a technique to improve a reinforcement learner with guidance from past learned similar policies. Our method relies on using the past policies in a novel way as a probabilistic bias where the learner faces three choices: the exploitation of the ongoing learned policy, the exploration of random unexplored actions, and the exploitation of past policies. We introduce t...
Multi-Agent Shared Hierarchy Reinforcement Learning
Hierarchical reinforcement learning facilitates faster learning by structuring the policy space, encouraging reuse of subtasks in different contexts, and enabling more effective state abstraction. In this paper, we explore another source of power of hierarchies, namely facilitating sharing of subtask value functions across multiple agents. We show that, when combined with suitable coordination ...
Using Human Demonstrations to Improve Reinforcement Learning
This work introduces Human-Agent Transfer (HAT), an algorithm that combines transfer learning, learning from demonstration and reinforcement learning to achieve rapid learning and high performance in complex domains. Using experiments in a simulated robot soccer domain, we show that human demonstrations transferred into a baseline policy for an agent and refined using reinforcement learning sig...
Policy Reuse for Transfer Learning Across Tasks with Different State and Action Spaces
Policy Reuse is a reinforcement learning method in which learned policies are saved and reused in similar tasks. The policy reuse learner extends its exploration to probabilistically include the exploitation of past policies, with the outcome of significantly improving its learning efficiency. In this paper we demonstrate that Policy Reuse can be applied for transfer learning among tasks in dif...